-
Notifications
You must be signed in to change notification settings - Fork 60
feat: Add external plugins support - DO NOT MERGE #183
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Maya Barnea <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not review beyond the first or second file.
IMO, this PR might be too big to review in one go and could benefit from breaking up (e.g.,):
- design doc/proposal (which should be approved separately)
- API definitions (e.g., JSON between client and server)
- sample server (e.g., python) and client, for a fixed function.
- improved client (e.g., configuration of http.Client for timeout handling, etc.)
There is no need to support all plugin types at once. Furthermore, since configuration via file is forthcoming, I would delay writing configuration code until it is merged. Otherwise it's throwaway code.
@@ -14,6 +14,7 @@ require ( | |||
github.com/prometheus/client_golang v1.22.0 | |||
github.com/redis/go-redis/v9 v9.7.3 | |||
github.com/stretchr/testify v1.10.0 | |||
github.com/valyala/fasthttp v1.62.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
q: what's the benefit of using a non standard HTTP implementation? If performance, then we should first prove that HTTP handling is a bottleneck.
@@ -52,15 +54,49 @@ const ( | |||
|
|||
prefixScorerBlockSizeEnvKey = "PREFIX_SCORER_BLOCK_SIZE" | |||
prefixScorerBlockSizeDefault = 256 | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point the external process plugin is for demo only, so I would refrain from introducing and using all these environment variables (which would be eliminated by the config changes anyway).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For a demo, you can just enable (or not) a fixed type plugins (e.g., filter or scorer) at a well known address (e.g., localhost:8000/8088
and run their container(s) in the same Pod)
@@ -22,6 +22,7 @@ import ( | |||
logutil "sigs.k8s.io/gateway-api-inference-extension/pkg/epp/util/logging" | |||
|
|||
"github.com/llm-d/llm-d-inference-scheduler/pkg/config" | |||
externalhttp "github.com/llm-d/llm-d-inference-scheduler/pkg/scheduling/plugins/external/http" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
import order seems wrong
@elevran what’s the intention about this PR? |
It is not intended to be merged at this time and was created to share ideas with community and to solicit feedback (the concept of out of process plugins was presented at a community meeting and had corresponding code for those interested). |
one option would be to create a branch called |
Agree. |
Add initial version of server side support for external HTTP plugins.
Add sample filter plugin implemented in python
Fixes: #163
Fixes: #164
Fixes: #184